prover: GPU compression path + plumbed (gated) GPU aggregation by gbotrel · Pull Request #3041 · Consensys/linea-monorepo

gbotrel · 2026-05-08T19:52:23Z

Summary

Wires GPU acceleration for the compression proof (data-availability-v2)
into the production prover and plumbs in — but does not enable — the GPU
path for the aggregation proof.

- Compression: auto-enabled whenever a CUDA device is reachable. Wall-clock on
  the reference host drops from ~4:40 (CPU) to ~2:10 per proof (3-proof
  batch average 2:10.19 on AWS g7e.8xlarge with one RTX PRO 6000 Blackwell).
- Aggregation: GPU code is wired (gpu/plonk2 PI/BW6/BN254, gpu/vortex PI
  MiMC + ring-SIS, gpu/quotient) but off by default. Operators must set
  LINEA_PROVER_GPU_AGGREGATION=1 to opt in. Production should leave it off
  for now.
- Controller: when launched on a GPU host, only compression jobs are
  accepted. Execution / aggregation / invalidity files are ignored even if
  the corresponding Enable* toggles are on, so a GPU host never falls back
  to a slow CPU path for non-compression work.

⚠️ DevOps must read this

1. Build flags

The CPU binary is unchanged. The GPU binary requires the cuda build tag and
links against the static libgnark_gpu.a produced from prover/gpu/cuda.

# CPU only
make bin/prover                    # GO_BUILD_TAGS=debug, no CUDA dependency

# GPU
make bin/prover-cuda               # GO_BUILD_TAGS=debug,cuda; needs libgnark_gpu.a
# equivalently:
make GO_BUILD_TAGS=debug,cuda bin/prover

bin/prover-cuda is a new make target. The static library
prover/gpu/cuda/build/libgnark_gpu.a must already exist before linking; the
CMake build is unchanged from the existing gpu/cuda source tree.

2. Host class per job type

Job type	Host class	Activation
Compression (DA-v2)	GPU (g7e.8xlarge, 1× RTX PRO 6000 Blackwell)	Auto-detected — no env var needed
Execution / Limitless	CPU (existing class, unchanged)	Unchanged
Aggregation	CPU (existing class, unchanged)	Stays on CPU; do not set the flag below
Invalidity	CPU (existing class, unchanged)	Unchanged

3. Controller behavior on GPU hosts

cmd/controller checks gpu.HasDevice() at start. When true:

- It accepts only *-getZkBlobCompressionProof.json jobs.
- EnableExecution, EnableAggregation and EnableInvalidity from the config
  are ignored even if they are true.

This is intentional: running the CPU paths on a GPU host with 32 vCPUs would
be much slower than dispatching them to the existing CPU pool.

CPU controllers are unchanged: same Enable* semantics as today.
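
The acceptance rule in this section can be sketched as a small filter. This is an illustration only, not the code in cmd/controller: `Toggles`, `JobKind`, `isCompressionFile` and `accepts` are hypothetical names, the sample file name is made up, and the CPU-host handling of compression jobs is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Toggles mirrors the Enable* switches named in this section.
// Hypothetical struct, not the real controller config type.
type Toggles struct {
	EnableExecution   bool
	EnableAggregation bool
	EnableInvalidity  bool
}

// JobKind is a hypothetical classification of a request file.
type JobKind int

const (
	Compression JobKind = iota
	Execution
	Aggregation
	Invalidity
)

// isCompressionFile reports whether name matches the
// *-getZkBlobCompressionProof.json pattern quoted above.
func isCompressionFile(name string) bool {
	return strings.HasSuffix(name, "-getZkBlobCompressionProof.json")
}

// accepts applies the rule from this section: on a GPU host only compression
// jobs pass, whatever the toggles say; on a CPU host the toggles decide.
func accepts(kind JobKind, hasGPU bool, t Toggles) bool {
	if hasGPU {
		return kind == Compression
	}
	switch kind {
	case Execution:
		return t.EnableExecution
	case Aggregation:
		return t.EnableAggregation
	case Invalidity:
		return t.EnableInvalidity
	default:
		return true // assumption: compression is not gated by these toggles
	}
}

func main() {
	all := Toggles{EnableExecution: true, EnableAggregation: true, EnableInvalidity: true}
	name := "example-getZkBlobCompressionProof.json" // hypothetical file name
	fmt.Println("compression file:", isCompressionFile(name))
	fmt.Println("GPU host, aggregation:", accepts(Aggregation, true, all))
	fmt.Println("GPU host, compression:", accepts(Compression, true, all))
	fmt.Println("CPU host, aggregation:", accepts(Aggregation, false, all))
}
```

Keeping the GPU check ahead of the toggle lookup is what makes the toggles irrelevant on a GPU host, which matches the behavior described above.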

4. Required runtime env vars

Compression (GPU host):

GOMEMLIMIT=180GiB
GOGC=75
# nothing else; GPU is auto-detected

These two values were used for the reference run. They keep peak process
memory at ~200 GiB on a 249 GiB host without thrashing the GC.

Aggregation (CPU host) — unchanged from origin/main today.

5. Required runtime resources

- GPU host (compression): 1× RTX PRO 6000 Blackwell, 96 GiB VRAM. Peak
  VRAM usage observed: ~80 GiB. Do not schedule another GPU process on
  the same card while a compression proof is in flight.
- Disk: the prover-assets 7.1.0/data-availability-v2/ directory must be
  present on the host. The canonical SRS is read once per process and
  benefits substantially from being in the OS page cache; a freshly booted
  host pays ~2 min of cold-cache cost on the first proof. Subsequent runs
  match the reference numbers in section 6.
- Memory: peak host RSS ~210 GiB (large because GPU pinned-memory
  staging buffers are reused across rounds and the gnark Go heap is
  intentionally large under GOMEMLIMIT=180GiB).

6. Compression reference numbers (3 sorted requests)

Run	Block range	Wall time	Setup load	Solver	GPU prover	Max RSS	CPU
1	`30388561-30389025`	2:10.41	16.81s	33.12s	1:43.61	200.7 GiB	285%
2	`30389026-30389504`	2:10.21	16.86s	33.11s	1:43.31	200.7 GiB	285%
3	`30389505-30390023`	2:09.96	16.90s	33.08s	1:43.12	200.6 GiB	286%

Average wall time: 2:10.19

Per-phase decomposition (from prover logs): solve 33 s → init GPU
instance ~19 s → MSM commit L,R,O ~4 s → build/iFFT/commit Z ~8 s →
quotient GPU ~25 s → MSM h₁,h₂,h₃ ~4 s → eval+linearize+open Z ~7 s →
batch opening ~4 s.

Raw artifacts under prover/reference-benchmarks/results/2026-05-08-g7e-8xlarge-gpu-compression-final/.

7. Proof-flow summary

Compression (GPU host)
  controller picks up *getZkBlobCompressionProof.json*
  └─ bin/prover prove ...
     ├─ LoadSetup (canonical SRS only — GPU path)         ~17s
     ├─ Solver (gnark constraint system)                  ~33s
     └─ gpu/plonk2/bls12377.GPUProve (BLS12-377 PlonK)   ~1:43

Aggregation (CPU host, unchanged)
  controller picks up *getZkAggregatedProof.json*
  └─ bin/prover prove ...
     ├─ makePiProof  → PI wizard + BLS12-377 PlonK   (CPU)
     ├─ makeBw6Proof → BW6-761 PlonK                 (CPU)
     └─ makeBn254Proof → BN254 emulation PlonK       (CPU)

Aggregation (GPU host, opt-in only — DO NOT ENABLE TODAY)
  Same flow + LINEA_PROVER_GPU_AGGREGATION=1 → all three PlonK phases on
  gpu/plonk2; PI Vortex MiMC and ring-SIS on gpu/vortex; quotient
  evaluation on gpu/quotient.

8. Rollback

The compression GPU path can be disabled without a code change: deploy the
non-cuda bin/prover, or hide the GPU device from the prover process (e.g.
CUDA_VISIBLE_DEVICES=""). In either case the prover falls back to gnark's
CPU PlonK prover.

The aggregation GPU path is off by default; nothing to roll back unless an
operator explicitly set LINEA_PROVER_GPU_AGGREGATION=1 (just unset it).
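
The two rollback levers above reduce to two boolean decisions, sketched here. This is not the prover's actual gating code: `useGPUCompression` and `useGPUAggregation` are hypothetical names, `hasDevice` stands in for the result of gpu.HasDevice(), and treating only the exact value "1" as opt-in is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// gpuAggregationEnv is the opt-in variable named in this PR.
const gpuAggregationEnv = "LINEA_PROVER_GPU_AGGREGATION"

// useGPUCompression: compression needs no flag, only a reachable device.
// Hiding the device (or running the non-cuda binary) makes this false.
func useGPUCompression(hasDevice bool) bool {
	return hasDevice
}

// useGPUAggregation: aggregation needs both a device and an explicit opt-in.
// getenv is injected so the rule can be exercised without touching the
// process environment.
func useGPUAggregation(hasDevice bool, getenv func(string) string) bool {
	return hasDevice && getenv(gpuAggregationEnv) == "1"
}

func main() {
	hasDevice := false // stand-in for gpu.HasDevice() in this sketch
	fmt.Println("GPU compression:", useGPUCompression(hasDevice))
	fmt.Println("GPU aggregation:", useGPUAggregation(hasDevice, os.Getenv))
}
```

With this shape, unsetting the variable is enough to return aggregation to the CPU path, and removing the device disables both GPU paths at once.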

Test plan

go build ./... (CPU)
go build -tags cuda,debug ./... (GPU)
go test ./gpu/plonk2/... -tags cuda,debug (per-curve correctness vs gnark CPU reference)
go test ./gpu -tags cuda,debug (device singleton)
go test ./circuits/... ./backend/aggregation/... ./cmd/controller/... (touched packages)
End-to-end compression on provertestdata2 × 3 sorted requests, all valid
Smoke run by devops on a staging GPU host (g7e.8xlarge or equivalent)
Confirm the controller on a CPU host still accepts execution/aggregation jobs
Confirm the controller on a GPU host rejects execution/aggregation jobs

🤖 Generated with Claude Code

Note

High Risk
High risk because it introduces a new GPU-backed proving path (gpu/plonk2 via CGO/CUDA) and refactors setup/SRS loading and PI/quotient/vortex hashing logic; mistakes could cause incorrect proofs, runtime failures, or performance regressions across critical proving flows.

Overview
Enables GPU-accelerated proving for data-availability (compression) by threading a new circuits.WithGPU option through ProveCheck, skipping Lagrange SRS loads when on GPU, and eagerly prefetching setups to reduce wall time.

Plumbs a gated GPU path for aggregation (PI → BW6 → BN254) behind LINEA_PROVER_GPU_AGGREGATION, including GPU-backed PI Vortex (MiMC + ring-SIS) and quotient coset reevaluation (CUDA-tagged implementations with CPU fallbacks).

Adds CUDA build tooling and ergonomics: bin/prover-cuda make target, CUDA typecheck in CI (go vet -tags=cuda ./gpu/...), new gpu/cuda CMake build files, and bumps the prover version/dependencies to support these changes.

Reviewed by Cursor Bugbot for commit 066520e.

* Compression (data-availability-v2) auto-enables the gpu/plonk2 prover whenever a CUDA device is reachable. Wall-clock on the reference host drops from ~4:40 (CPU) to ~2:10 per proof.
* Aggregation GPU plumbing (gpu/plonk2 PI/BW6/BN254 + gpu/vortex PI MiMC and ring-SIS + gpu/quotient) is wired but disabled by default behind $LINEA_PROVER_GPU_AGGREGATION; leave the flag off in production for now.
* cmd/controller refuses execution / aggregation / invalidity jobs when a GPU is detected; only compression is accepted on a GPU host.

See prover/reference-benchmarks/README.md for the host class, build command, runtime flags and 3-proof compression reference (avg 2:10.19 on AWS g7e.8xlarge with an RTX PRO 6000 Blackwell).

Copilot

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

socket-security · 2026-05-08T19:56:24Z

Review the following changes in direct dependencies.

Diff	Package	Supply Chain Security	Vulnerability	Quality	Maintenance	License
	golang/github.com/ethereum/go-ethereum@v1.17.0 ⏵ v1.17.2
	golang/github.com/prometheus/client_golang@v1.19.1 ⏵ v1.23.2	^-3
	golang/golang.org/x/net@v0.49.0 ⏵ v0.53.0	⁺¹
	golang/github.com/consensys/gnark-crypto@v0.20.2-0.20260402204920-39238e584b99 ⏵ v0.20.2-0.20260504203407-0dce6009ca13	⁺¹
	golang/github.com/consensys/gnark@v0.14.1-0.20260505192735-3460cedcac43 ⏵ v0.14.1-0.20260508134514-a9bb4257c480
	golang/golang.org/x/sys@v0.42.0 ⏵ v0.44.0	⁺¹
	golang/github.com/go-playground/validator/v10@v10.28.0 ⏵ v10.30.2
	golang/github.com/klauspost/compress@v1.18.3 ⏵ v1.18.6	⁺¹
	golang/golang.org/x/exp@v0.0.0-20251219203646-944ab1f22d93 ⏵ v0.0.0-20260410095643-746e56fc9e2f
	golang/github.com/consensys/go-corset@v1.2.14
	golang/github.com/fxamacker/cbor/v2@v2.9.0 ⏵ v2.9.2	⁺¹
	golang/github.com/pierrec/lz4/v4@v4.1.22 ⏵ v4.1.26	⁺¹
	golang/github.com/spf13/viper@v1.19.0 ⏵ v1.21.0
	golang/github.com/dlclark/regexp2@v1.11.2 ⏵ v1.12.0	⁺¹
	golang/golang.org/x/time@v0.9.0 ⏵ v0.15.0


codecov-commenter · 2026-05-08T19:59:50Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.84%. Comparing base (a1a9917) to head (066520e).
⚠️ Report is 11 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##               main    #3041   +/-   ##
=========================================
  Coverage     75.84%   75.84%           
  Complexity     6844     6844           
=========================================
  Files          1121     1121           
  Lines         44508    44508           
  Branches       5355     5355           
=========================================
  Hits          33755    33755           
  Misses         9469     9469           
  Partials       1284     1284

Flag	Coverage Δ	*Carryforward flag
hardhat	`96.17% <ø> (ø)`
kotlin	`52.38% <ø> (ø)`	Carriedforward from c379756
lido-governance-monitor	`97.61% <ø> (ø)`	Carriedforward from c379756
linea-native-libs	`90.69% <ø> (ø)`	Carriedforward from c379756
linea-shared-utils	`96.18% <ø> (ø)`	Carriedforward from c379756
native-yield-automation-service	`97.68% <ø> (ø)`	Carriedforward from c379756
postman	`99.92% <ø> (ø)`	Carriedforward from c379756
sdk-core	`98.09% <ø> (ø)`	Carriedforward from c379756
sdk-ethers	`89.83% <ø> (ø)`	Carriedforward from c379756
sdk-viem	`99.45% <ø> (ø)`	Carriedforward from c379756
tracer	`88.56% <ø> (ø)`	Carriedforward from c379756

*This pull request uses carry forward flags.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

socket-security · 2026-05-12T14:34:21Z

Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts.

Action	Severity	Alert
Warn		Obfuscated code: golang `github.com/pelletier/go-toml/v2` is 90.0% likely obfuscated

Confidence: 0.90
Location: Package overview
From: `?` → `golang/github.com/spf13/viper@v1.21.0` → `golang/github.com/pelletier/go-toml/v2@v2.3.1`

Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at `support@socket.dev`.

Suggestion: Packages should not obfuscate their code. Consider not using packages with obfuscated code.

Mark the package as acceptable risk: to ignore this alert only in this pull request, reply with the comment `@SocketSecurity ignore golang/github.com/pelletier/go-toml/v2@v2.3.1`. You can also ignore all packages with `@SocketSecurity ignore-all`. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

- config-mainnet-limitless.toml: restore relative paths (dev-host absolute paths leaked into the committed prod config).
- prover-testing.yml: run `go vet -tags=cuda ./gpu/...` in the static check job so CPU refactors that break GPU compilation are caught. vet compiles but does not link, so no CUDA toolchain needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit c379756.

Adds deterministic byte-level parity checks between the GPU plonk prover's Fiat-Shamir helpers and the audited gnark CPU construction.

* TestFiatShamirChallengeParity (+ NoBsb22 variant): replays the four prover challenges (gamma, beta, alpha, zeta) through the GPU's bindPublicData/deriveRandomness helpers and compares each derived fr.Element against an inline reference built directly on gnark-crypto's public fiat-shamir API. The reference mirrors gnark CPU's exact bind order from backend/plonk/{curve}/{prove, verify}.go.
* TestFiatShamirBatchOpenParity: exercises gpuBatchOpen's KZG-folding FS instance against gnark-crypto's kzg.BatchOpenSinglePoint on identical synthetic inputs (same polys, digests, claimed values, point, dataTranscript, SRS, and folding hash). When the gamma folding challenge matches byte-for-byte, the quotient commitment H is bit-identical; any FS drift yields a different H.

Generated for bn254, bls12377, bw6761 via the existing template pipeline. All 9 tests pass locally on RTX PRO 6000 Blackwell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 8, 2026 19:52

Copilot AI reviewed May 8, 2026

cursor Bot reviewed May 8, 2026

Comment thread prover/config/config-mainnet-limitless.toml Outdated

chore(prover): gofmt fixes for gpu packages

4f50bff

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

gbotrel requested a review from gusiri May 12, 2026 14:44

cursor Bot reviewed May 12, 2026

Comment thread prover/circuits/pi-interconnection/keccak/prover/protocol/compiler/globalcs/quotient.go

gbotrel requested a review from ThomasPiellard May 12, 2026 14:59

gbotrel wants to merge 4 commits into main from prover/gpu-compression
